Learning GP-trees from Noisy Data
Authors
Abstract
We discuss the problem of model selection in Genetic Programming using the framework provided by Statistical Learning Theory, i.e. Vapnik-Chervonenkis (VC) theory. We present empirical comparisons between classical statistical methods for model selection (AIC, BIC) and the Structural Risk Minimization method (based on VC theory) on symbolic regression problems. These comparisons suggest practical advantages of VC-based model selection when using genetic training.

Keywords: model selection, genetic programming, symbolic regression
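To make the comparison concrete, here is a minimal sketch of how the three criteria trade off training error against model complexity. The AIC and BIC formulas below are the standard forms under a Gaussian noise assumption; the SRM penalization factor is the commonly used VC bound for regression popularized by Cherkassky and Mulier, with p = h/n for VC dimension h and sample size n. The candidate list is invented toy data (not from the paper), and using a single complexity number k as both the parameter count and the VC dimension estimate of a GP tree is a simplifying assumption.

```python
import math

def aic(rss, k, n):
    # Akaike Information Criterion under Gaussian noise:
    # n * ln(RSS / n) + 2k
    return n * math.log(rss / n) + 2 * k

def bic(rss, k, n):
    # Bayesian Information Criterion: n * ln(RSS / n) + k * ln(n)
    return n * math.log(rss / n) + k * math.log(n)

def srm(rss, k, n):
    # SRM-style penalized empirical risk using the VC penalization
    # factor for regression (Cherkassky & Mulier):
    #   R_pen = (RSS / n) / (1 - sqrt(p - p*ln(p) + ln(n)/(2n)))_+
    # with p = h/n; the risk is taken as infinite when the
    # denominator is non-positive.
    p = k / n
    denom = 1.0 - math.sqrt(p - p * math.log(p) + math.log(n) / (2 * n))
    if denom <= 0:
        return float("inf")
    return (rss / n) / denom

n = 30
# Invented (complexity, training RSS) pairs: RSS keeps shrinking as
# the GP tree grows, the usual overfitting pattern on noisy data.
candidates = [(1, 12.0), (3, 4.0), (5, 2.5), (10, 1.3), (20, 1.2)]

criteria = {"AIC": aic, "BIC": bic, "SRM": srm}
selections = {name: min(candidates, key=lambda c: f(c[1], c[0], n))[0]
              for name, f in criteria.items()}
# On these toy numbers the VC-based criterion favors a smaller tree
# than AIC and BIC, illustrating its more conservative penalty.
print(selections)
```

Note how the VC penalty is multiplicative and blows up as h approaches n, while the AIC/BIC penalties grow only linearly in k; this is what makes SRM comparatively reluctant to accept large trees on small noisy samples.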
Similar Papers
Penalty Functions for Genetic Programming Algorithms
Very often symbolic regression, as addressed in Genetic Programming (GP), amounts to approximate interpolation. This means that, in general, GP algorithms try to fit the sample as well as possible, with no notion of generalization error. As a consequence, overfitting, code bloat and noisy data are problems that are not satisfactorily solved under this approach. Motivated by...
Full text
Application of Genetic Programming to Induction of Linear Classification Trees
A common problem in data mining is to find accurate classifiers for a dataset. For this purpose, genetic programming (GP) is applied to a set of benchmark classification problems. Using GP, we are able to induce decision trees with a linear combination of variables in each function node. A new representation of decision trees using Strong Typing in GP is introduced. With this representation it is no...
Full text
Learning from Multiple Annotators with Gaussian Processes
In many supervised learning tasks it can be costly or infeasible to obtain objective, reliable labels. We may, however, be able to obtain a large number of subjective, possibly noisy, labels from multiple annotators. Typically, annotators have different levels of expertise (i.e., novice, expert) and there is considerable disagreement among annotators. We present a Gaussian process (GP) approach ...
Full text
Improving Induction of Linear Classification Trees with Genetic Programming
A common problem in data mining is to find accurate classifiers for a dataset. For this purpose, genetic programming (GP) is applied to a set of benchmark classification problems. Using GP, we are able to induce decision trees with a linear combination of variables in each function node. A new representation of decision trees using Strong Typing in GP was introduced in [Bot and Langdon, 2000]. The effe...
Full text
Application of Genetic Programming to Induction of Linear Classification Trees
A common problem in data mining is to find accurate classifiers for a dataset. For this purpose, genetic programming (GP) is applied to a benchmark of classification problems. In particular, using GP we are able to induce decision trees with a linear combination of variables in each function node. The effects of techniques such as limited error fitness, fitness sharing, Pareto scoring and domination Pareto sco...
Full text